(57) Abstract

The invention relates to a method for implementing a memory. The memory is
implemented as a directory structure comprising a tree-shaped hierarchy having
nodes at several different levels, wherein an individual node can be (i) a trie
node comprising an array wherein an individual element may contain the address
of a lower node in the tree-shaped hierarchy and wherein an individual element
may also be empty, the number of elements in the array corresponding to a power
of two, or (ii) a bucket containing at least one element so that the type of an
individual element in the bucket is selected from a group including a data
unit, a pointer to a stored data unit, a pointer to a node in another directory
structure and another directory structure. To optimize storage space occupancy
and memory efficiency, in at least part of the directory structure sets of
successive trie nodes are replaced with compressed nodes in such a way that an
individual set made up by successive trie nodes, from each of which there is
only one address to a trie node at a lower level, is replaced with a compressed
node (CN) storing an address to the node that the lowest node in the set to be
replaced points to, information on the value of the search word by means of
which said address is found, and information on the total number of bits from
which search words are formed in the set to be replaced. The invention also
relates to a structure in which buckets are not employed.
Method for implementing an associative memory based on a digital trie structure

Field of the Invention 

The present invention generally relates to implementation of an associative
memory, particularly to implementation of an associative memory based on a
digital trie structure. The solution in accordance with the invention is
intended for use primarily in connection with central memory databases, and it
can be used in conjunction with all memories based on a digital trie structure.

Background of the Invention

The prior art unidimensional directory structure termed digital trie (the word
"trie" is derived from the English word "retrieval") is the underlying basis of
the principle of the present invention. Digital tries can be implemented in two
types: bucket tries, and tries having no buckets.

A digital bucket trie structure is a tree-shaped structure composed of two
types of nodes: buckets and trie nodes. A bucket is a data structure containing
a number of data units or a number of pointers to data units or a number of
search key/pointer pairs (the number may include only one data unit, one
pointer or one key/pointer pair). A trie node, on the other hand, is an array
guiding the retrieval, having a size of two to the power of k (2^k) elements.
If an element in a trie node is in use, it refers either to a trie node at the
next level in the directory tree or to a bucket. In other cases, the element is
free (empty).

Search in the database proceeds by examining the search key (which in the case
of a subscriber database in a mobile telephone network or a telephone exchange,
for instance, is typically the binary numeral corresponding to the telephone
number of the subscriber) k bits at a time. The bits to be searched are
selected in such a way that at the root level of the structure (in the first
trie node), the k leftmost bits are searched; at the second level of the
structure, the k bits next to the leftmost bits are searched, etc. The bits to
be searched are interpreted as an unsigned binary integer that is employed
directly to index the element array contained in the trie node, the index
indicating a given element in the array. If the element indicated by the index
is free, the search will terminate as unsuccessful. If the element refers to a
trie node at the next level, the k next bits extracted from the search key are
searched at that level in the manner described above. As a result of the
comparison, the routine branches off in the trie node either to a trie node at
the next level or to a bucket. If the element refers to a bucket containing a
key, the key stored therein is compared with the search key. The entire search
key is thus compared only after the search has encountered a bucket. Where the
keys are equal, the search is successful, and the desired data unit is obtained
at the storage address indicated by the pointer of the bucket. Where the keys
differ, the search terminates as unsuccessful.

A bucketless trie structure has no buckets, but reference to a data unit is
effected from a trie node at the lowest level of a tree-shaped hierarchy,
called a leaf node. Unlike buckets, the leaf nodes in a bucketless structure
cannot contain data units but only pointers to data units. Also a bucket
structure has leaf nodes, and hence trie nodes containing at least one pointer
to a bucket (bucket structure) or to a data unit (bucketless structure) are
leaf nodes. The other nodes in the trie are internal nodes. Trie nodes may thus
be either internal nodes or leaf nodes. By means of buckets, the need for
reorganizing the directory structure can be postponed, as a large number of
pointers/data units can be accommodated in the buckets until a time when the
need for reorganization arises.

The solution in accordance with the invention can be applied to a bucket
structure as well as a bucketless structure. In the following, bucket
structures will nevertheless be used as examples.

Figure 1 illustrates an example of a digital trie structure in which the key
has a length of 4 bits and k=2, and thus each trie node has 2^2 = 4 elements,
and two bits extracted from the key are searched at each level. Buckets are
denoted with references A, B, C, D...H...M, N, O and P. Thus a bucket is a node
that does not point to a lower level in the tree. Trie nodes are denoted with
references IN1...IN5 and elements in the trie node with reference NE in
Figure 1.

In the exemplary case of Figure 1, the search keys for the buckets shown are as
follows: A=0000, B=0001, C=0010, ..., H=0111, ... and P=1111. In this case, a
pointer is stored in each bucket to that storage location in the database SD at
which the actual data, e.g. the telephone number of the pertinent subscriber
and other information relating to that subscriber, is to be found. The actual
subscriber data may be stored in the database for instance as a sequential file
of the type shown in the figure. The search is performed on the basis of the
search key of record H, for example, by first extracting from the search key
the two leftmost bits (01) and interpreting them, which delivers the second
element of node IN1, containing a pointer to node IN3 at the next level. At
this level, the two next bits (11) are extracted from the search key, thus
yielding the fourth element of that node, pointing to record H.
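By way of a non-limiting illustration, the lookup described for Figure 1 can be sketched as follows. The sketch assumes a 4-bit key, k=2, and one key/data pair per bucket; the names make_node, insert and search, and the use of Python lists and tuples for trie nodes and buckets, are hypothetical and are not taken from the disclosure.

```python
K = 2                      # bits examined per level, as in Figure 1
MASK = (1 << K) - 1

def make_node():
    """A trie node: an array of 2^k elements, all initially empty."""
    return [None] * (1 << K)

def insert(root, key, data, key_len=4):
    """Store (key, data) in a bucket reached by consuming key_len bits."""
    node = root
    for shift in range(key_len - K, 0, -K):   # all levels except the last
        idx = (key >> shift) & MASK
        if node[idx] is None:
            node[idx] = make_node()
        node = node[idx]
    node[key & MASK] = ("bucket", key, data)  # assumed bucket layout

def search(root, key, key_len=4):
    """Follow k bits per level; compare the whole key only at the bucket."""
    node = root
    for shift in range(key_len - K, -1, -K):
        elem = node[(key >> shift) & MASK]
        if elem is None:                      # free element: unsuccessful
            return None
        if isinstance(elem, tuple):           # reached a bucket
            _, stored_key, data = elem
            return data if stored_key == key else None
        node = elem                           # trie node at the next level
    return None

root = make_node()
insert(root, 0b0111, "record H")
print(search(root, 0b0111))                   # bits 01, then bits 11
```

For key 0111 the first step selects the second element of the root (bits 01) and the second step selects the fourth element of the next node (bits 11), matching the walk described above.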

Instead of a pointer, a bucket may contain (besides a search key) an actual
data file (also called by the more generic term data unit). Thus for example
the data relating to subscriber A (Figure 1) may be located in bucket A, the
data relating to subscriber B in bucket B, etc. Thus in the first embodiment of
an associative memory, a key-pointer pair is stored in the bucket, and in the
second embodiment a key and actual data are stored, even though the key is not
indispensable.

The search key may also be multidimensional. In other words, it may comprise a
number of attributes (for example the family name and one or more forenames of
a subscriber). Such a multidimensional trie structure is disclosed in
international application No. PCT/FI95/00319 (published under number
WO 95/34155). In said structure, address computation is performed in such a way
that a given predetermined number of bits at a time is selected from each
dimension independently of the other dimensions. Hence, a fixed limit
independent of the other dimensions is set for each dimension in any individual
node of the trie structure, by predetermining the number of search key bits to
be searched in each dimension. With such a structure, the memory circuit
requirement can be curbed when the distribution of the values of the search
keys is known in advance, in which case the structure can be implemented in a
static form.

If it is desired that the structure can be reorganized in accordance with the
current key distribution so as to be optimal in terms of efficiency and storage
space occupancy, the size of the nodes must vary dynamically as the key
distribution changes. When the key distribution is uniform, the node size may
be increased to make the structure flatter. On the other hand, with non-uniform
key distributions in connection with which storage space occupancy will present
a problem in memory structures employing dynamic node size, the node size can
be kept small, which will enable locally a more uniform key distribution and
thereby smaller storage space occupancy. Dynamic changes in node size
presuppose implementation of address computation in such a way that in each
node of the tree-shaped hierarchy constituted by the digital trie structure, a
node-specific number of bits is selected from the bit string constituted by the
search keys employed.

The choice between a fixed node size and a dynamically changing node size
depends, for example, on the kind of application for which the memory is
intended, for example on the number of retrievals, insertions and deletions to
be made in the database and on the proportions of these operations.

Irrespective of whether a fixed or changing node size is used in the memory,
memories based on the digital trie structure are nevertheless attended by the
problem of how the empty space inevitably created in the structure can be
modelled in such a way that storage space occupancy will be as low as possible
and memory efficiency (speed of memory operations) as good as possible.

Summary of the Invention

It is an objective of the present invention to provide a solution to the 
above problem. This objective is achieved with the method defined in the in- 
dependent claims. The first of these discloses a structure employing buckets 
and the second a structure not employing buckets. 

The basic idea of the invention is to compress such nodes in a digital trie
structure that provide only a single path downward in a tree-shaped hierarchy.
The data needed to proceed in the structure and for reorganization of nodes is
stored in such a compressed node, without any storage space being required for
(an) element array(s).

On account of the solution of the invention, the empty space present in the
trie structure can be modelled in such a way that storage space occupancy in
the structure will remain small with uniform as well as non-uniform key
distributions. Furthermore, the solution enables the number of memory
references requiring computation time to be minimized, thus making the
efficiency (speed) of the memory as good as possible.

In accordance with a preferred embodiment of the invention, each chain made up
by successive compressed nodes is replaced with a single collecting node. This
enables elimination of chains made up by successive compressed nodes as a
result of limited word length. Elimination of chains will further improve
memory efficiency and curb the need for storage space.

The solution in accordance with the invention also ensures effective
performance of set operations, as the structure is an order-preserving digital
trie.



Brief Description of the Drawings

In the following the invention and its preferred embodiments will be 
described in closer detail with reference to examples in accordance with the 
accompanying drawings, in which 

Figure 1 illustrates the use of a unidimensional digital trie structure in the
    maintenance of subscriber data in a telephone exchange,
Figure 2 shows a multidimensional trie structure,
Figure 3 shows a memory structure in accordance with the invention,
Figure 4 illustrates implementation of address computation in the memory of
    the invention,
Figure 5 illustrates the structure of a trie node of the memory when the
    memory employs dynamic node size,
Figures 6a and 6b illustrate the principle of forming a compressed node,
Figures 7a and 7b show an example of the maintenance of the memory structure,
Figure 8 illustrates the structure of a compressed node employed in the
    memory,
Figure 9a illustrates the limitation posed by the word length employed on
    combining the nodes,
Figure 9b shows the structure of a collecting node to be formed from the node
    chain of Figure 9a, and
Figure 10 shows the memory arrangement in accordance with the invention on
    block diagram level.



Detailed Description of the Invention

As stated previously, in the present invention the trie structure has a
multidimensional (generally n-dimensional) implementation. Such a
multidimensional structure is otherwise fully similar to the unidimensional
structure described at the beginning, but the element array contained in the
trie node is multidimensional. Figure 2 exemplifies a two-dimensional 2^2 × 2^1
structure, in which one dimension in the element array comprises four elements
and the other dimension two elements. Buckets pointed to from the elements in
the trie node are indicated with circles in the figure.

Address computation in the multidimensional case is performed on the same
principle as in the unidimensional case. The fundamental difference, however,
resides in that instead of a single element array index, an index is calculated
for each dimension in the element array (n indices). Each dimension thus has a
search key space of its own, {0, 1, ..., 2^{v_i} - 1} (v_i is the length of the
search key in bits in each dimension and i ∈ {1, ..., n}).

The size of the trie node in the direction of each dimension is 2^{k_i}
elements, and the total number of elements S in the trie node is also a power
of two:

    S = \prod_{i=1}^{n} 2^{k_i} = 2^{k_1} \times 2^{k_2} \times 2^{k_3} \times \dots = 2^{k_1 + k_2 + \dots + k_n}    (1)

All elements in a trie node having n dimensions can thus be pointed to by n
integers (n ≥ 2), each of which may have a value in the range
{0, 1, ..., 2^{v_i} - 1}. Thus the predetermined fixed parameter is the total
length of the search key in each dimension. If for example one dimension of the
search key has 256 attributes (such as first names) at most, the total length
of the search key is 8 bits.

Figure 3 shows an example of a node N10 used in the directory structure of the
memory in accordance with the invention, employing a three-dimensional search
key. In the direction of the first dimension (x), the trie node has 2^2 = 4
elements, in the direction of the second dimension (y) 2^1 = 2 elements, and in
the direction of the third dimension (z) 2^3 = 8 elements, which gives a total
of 2^6 = 64 elements in the trie node, numbered 0...63.

Since the memory space in practical hardware implementations (for example
computer equipment) is unidimensional, the multidimensional array is
linearized, i.e. converted to be unidimensional, in the address computation
operation (that is, in proceeding in the directory tree). The linearization is
an arithmetic operation that can be performed on arrays of all sizes. Hence, it
is irrelevant whether the trie nodes or their element arrays are considered to
be unidimensional or multidimensional, as multidimensional arrays are
linearized in any case to be unidimensional.

In linearization, the elements in the array are numbered starting from zero (as
shown in Figure 3), the number of the last element being one less than the
product of the sizes of all dimensions. The number of an element is the sum of
the products of each coordinate (for example in the three-dimensional case, the
x, y and z coordinates) and the sizes of the dimensions preceding it. The
number thus computed is employed directly to index the unidimensional array.

In the case of the trie node shown in Figure 3, the element number VA_n is
calculated in accordance with the above with the formula:

    VA_n = x + y \times 4 + z \times 4 \times 2    (2)

where x ∈ {0,1,2,3}, y ∈ {0,1} and z ∈ {0,1,2,3,4,5,6,7}. Thus for example for
element 54 we obtain from the coordinates thereof (2,1,6):
2 + 1×4 + 6×4×2 = 2 + 4 + 48 = 54.
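As a minimal sketch of the numbering rule just stated, the following computes the element number as the sum of each coordinate multiplied by the sizes of the dimensions preceding it. The function name linearize and its argument layout are assumptions made for illustration only.

```python
def linearize(coords, sizes):
    """Element number = sum of coordinate * product of preceding sizes."""
    index, stride = 0, 1
    for c, size in zip(coords, sizes):
        if not 0 <= c < size:
            raise ValueError("coordinate outside its dimension")
        index += c * stride
        stride *= size          # sizes of all dimensions handled so far
    return index

# Figure 3 node: 4 elements in x, 2 in y, 8 in z.
print(linearize((2, 1, 6), (4, 2, 8)))   # coordinates of element 54
```

With sizes (4, 2, 8) the strides are 1, 4 and 8, which reproduces formula (2); the last element (3, 1, 7) receives the number 63.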

When the (n-dimensional) element array of a trie node of an n-dimensional trie
structure is linearized, in accordance with the above the size of each
dimension is 2^{k_i}, where k_i is the number of bits to be searched at a time
in the dimension concerned. If a coordinate in accordance with the dimension is
denoted by reference a_j (j ∈ {0, 1, 2, ..., n}), the linearization can be
written out as

    VA_n = \sum_{j=1}^{n} a_j \prod_{i=0}^{j-1} 2^{k_i}, \quad k_0 = 0    (3)

The linearization can be carried out by performing a multiplication in
accordance with formula (3); yet it is expedient to perform the linearization
by forming from the search key bits a bit string by known methods, the
corresponding numeral indicating the element whose content provides the basis
for proceeding in the directory tree. Such a linearization method is termed bit
interleaving. Bit interleaving is a more efficient (rapid) method than the
multiplication in accordance with formula (3), since when bit interleaving is
used multiplications will be converted to additions and bit shifts, which are
faster to perform.
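The remark on shifts can be illustrated as follows. Because every dimension size is a power of two, each multiplication in the linearization is equivalent to a left shift by the number of bits of the preceding dimensions. This is only a sketch of that equivalence, not of any particular interleaving order; the function name linearize_shift is hypothetical.

```python
def linearize_shift(coords, bits):
    """Linearize with shifts; bits[i] is the bit count k of dimension i."""
    index, shift = 0, 0
    for c, k in zip(coords, bits):
        if not 0 <= c < (1 << k):
            raise ValueError("coordinate does not fit in its bit count")
        index |= c << shift     # same as c * 2^(sum of preceding k values)
        shift += k
    return index

# Figure 3 node: k = 2, 1 and 3 bits in dimensions x, y and z.
print(linearize_shift((2, 1, 6), (2, 1, 3)))
```

The result agrees with the multiplication form for the same coordinates, since the shifted fields never overlap.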

The most common way to implement bit interleaving is the "z ordering". Another
possible bit interleaving method is the line ordering. In the present
invention, it is advantageous to use line ordering, as it affords the most
efficient address computation in memory searches, but any known bit
interleaving method may be employed, as long as the same method is employed in
all nodes of the structure.

Figure 4 illustrates an example of address computation performed in the trie
structure in accordance with the invention. In the figure, it has been presumed
that the memory employs dynamically changing node sizes and that the space is
three-dimensional (dimensions x, y and z). It has further been presumed that
search key a_x in the direction of dimension x is a_x = 011011, search key a_y
in the direction of dimension y is a_y = 110100 and search key a_z in the
direction of dimension z is a_z = 101010. The search keys are listed one below
another in the figure.

In the nodes of the trie structure, the indexing bits of a unidimensional
element array are shown in frames denoted by continuous lines. These frames
illustrate how a global search key is divided into local search keys (element
array indices), each being used in one node of the trie structure. All frames
denoted by continuous lines relate to the first bit interleaving method, i.e.
the z ordering. The nodes in the structure are denoted by references N1...N7 in
the order of progress. In the first node (N1) (at the uppermost level) only a
single bit is employed, which is the leftmost bit in the search key of
dimension x (which is a logical zero). Thereafter the routine proceeds in the
direction of the arrow to the next node (N2), in which the number of bits
forming the local search key is two. These are the leftmost bit in search key
a_y and the leftmost bit in search key a_z. In z ordering, the order of the
bits is always as presently shown, in other words, the first bit of the first
dimension is first extracted, thereafter the first bit of the second dimension,
thereafter the first bit of the third dimension, etc. After the first bit of
the last dimension, the second bits are extracted from the different
dimensions, starting from the first dimension. In this way, the following
node-specific element array indices are obtained: 0 (node N1), 11 (node N2),
110 (node N3), 10 (node N4), 1010 (node N5), 10 (node N6) and 1100 (node N7).
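The z ordering walk above can be reproduced with a short sketch. It assumes that the keys are given as bit strings of equal length and that the per-node bit counts (1, 2, 3, 2, 4, 2, 4 for N1...N7, read off the indices listed above) are known; the function name z_order_indices is hypothetical.

```python
def z_order_indices(keys, bits_per_node):
    """Split the z-ordered bit stream of the keys into per-node indices."""
    length = len(keys[0])
    if any(len(k) != length for k in keys):
        raise ValueError("all search keys must have the same length")
    # First bit of every dimension, then second bit of every dimension, ...
    stream = "".join(key[i] for i in range(length) for key in keys)
    indices, pos = [], 0
    for k in bits_per_node:
        indices.append(stream[pos:pos + k])
        pos += k
    return indices

a_x, a_y, a_z = "011011", "110100", "101010"
print(z_order_indices([a_x, a_y, a_z], [1, 2, 3, 2, 4, 2, 4]))
```

The interleaved stream is 011110101010101100, and cutting it at the node boundaries yields the element array indices given for nodes N1...N7.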

Alternatively, some other known bit interleaving method, such as line ordering,
may be employed in the memory. In Figure 4, the frames denoted by broken lines
and the arrows pertaining to them illustrate the forming of an element array
index in node N5, the memory employing bit interleaving with line ordering. In
the example of the figure, it has further been presumed that progress has been
made in nodes N1...N4 so far that the first bit searched in node N5 is the
third from the left in the search key in dimension z. In line ordering, all
bits of each dimension are extracted at a time.

When line ordering is employed, the minimum number of bits to be extracted from
the search keys of the different dimensions is first calculated in the node.
This is obtained by dividing the number of bits searched in the node by the
number of the dimensions and by truncating the obtained result to an integer.
In this exemplary case, the number of bits to be searched in node N5 is four
and the number of dimensions three, which gives a minimum number of one (that
is, at least one bit must be extracted from the search key of each dimension).
Thereafter it is still to be calculated how many additional bits must be
extracted from the search keys of the different dimensions. The number of
additional bits A is obtained from the formula A = k mod n, where k is the
number of bits to be searched in the node and n is the number of dimensions. In
this exemplary case, the result is A = 4 mod 3 = 1. The result 1 thus means
that one additional bit is to be extracted. Extraction of additional bits is
always started from the first searched dimension. In this exemplary case, one
additional bit is thus extracted from the search key of dimension z. If the
result had been two, one additional bit from the search key of dimension z and
one additional bit from the search key of dimension x would have been
extracted.

Hence, in this exemplary case one bit from the search key of each dimension and
additionally one bit from the search key of dimension z is extracted. Since in
employing line ordering all bits of a dimension are extracted at a time, all
bits (10) to be taken from dimension z are extracted first, thereafter all bits
(0) to be used from the search key of dimension x, and lastly all bits (1) to
be extracted from the search key of dimension y. Thus, when line ordering is
employed, the bit string 1001 is obtained as the element array index of node
N5; this bit string is depicted in the lower portion of Figure 4.
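The line ordering computation for node N5 can be sketched as below. The sketch assumes that the number of key bits already consumed in each dimension (the cursors) and the first searched dimension are known on entry to the node; the function name line_order_index and its parameters are hypothetical.

```python
def line_order_index(keys, cursors, first_dim, k):
    """Form a node's element array index with line ordering.

    keys      -- one bit string per dimension
    cursors   -- bits already consumed in each dimension
    first_dim -- dimension whose bits are extracted first
    k         -- number of bits searched in this node
    """
    n = len(keys)
    minimum, additional = divmod(k, n)   # k div n and A = k mod n
    parts = []
    for step in range(n):
        d = (first_dim + step) % n       # dimensions in cyclic order
        take = minimum + (1 if step < additional else 0)
        start = cursors[d]
        parts.append(keys[d][start:start + take])  # all bits of d at once
    return "".join(parts)

a_x, a_y, a_z = "011011", "110100", "101010"
# On entry to N5: three bits of x and y consumed, two bits of z consumed.
print(line_order_index([a_x, a_y, a_z], [3, 3, 2], first_dim=2, k=4))
```

For k=4 and n=3 the minimum is one bit per dimension and one additional bit goes to dimension z, so the pieces are 10 (z), 0 (x) and 1 (y), giving 1001.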

Since in the memory of the invention the address computation is performed by
using bit interleaving known per se, the address computation will not be
described in further detail.

Since the order of bits in the local search key (element array index) to be
formed in each node is constant, only the number of bits to be used must be
known in the bit string formation performed in each node. This data is stored
in each node. In addition, only an element array must be present in each
ordinary trie node. Figure 5 illustrates the structure of an ordinary trie node
when dynamically changing node size is employed. In its minimum configuration,
the node thus comprises only two parts: a field indicating the number of bits
to be searched in the node (reference 51) and an element array (reference 52),
the number of elements in the array corresponding to a power of two. For
proceeding in the directory tree, in addition to the number of bits to be
searched the type of each node must be known. This data can be stored in the
directory structure for example in each node or in the pointer of the parent of
the node. By means of the two "extra" bits of the pointer (a and b, Figure 5),
information can be encoded in the pointer on whether a zero pointer (an empty
element) is concerned or whether the pointer points to an ordinary trie node, a
bucket or a compressed trie node (which will be described hereinbelow). The
encoding may be for example of the type shown in the figure.
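One conceivable way of carrying the node type in two pointer bits is sketched below. The actual encoding of bits a and b is shown only in Figure 5 and is not reproduced here; the bit assignment, the use of the two least significant bits, and the assumption of 4-byte aligned addresses are all hypothetical choices made for this illustration.

```python
from enum import IntEnum

class NodeType(IntEnum):
    # Hypothetical assignment of the two "extra" bits; not taken from Figure 5.
    EMPTY = 0b00
    TRIE = 0b01
    BUCKET = 0b10
    COMPRESSED = 0b11

TAG_MASK = 0b11

def tag(address, node_type):
    """Pack a node type into the two low bits of an aligned address."""
    if address & TAG_MASK:
        raise ValueError("address must be 4-byte aligned")
    return address | int(node_type)

def untag(pointer):
    """Recover (address, node type) from a tagged pointer."""
    return pointer & ~TAG_MASK, NodeType(pointer & TAG_MASK)

p = tag(0x1000, NodeType.COMPRESSED)
print(untag(p))
```

With such a scheme the type of the child is known before the child itself is read, which is the property the paragraph above relies on.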
In the case of a bucketless structure, information on whether the pointer
points to an uncompressed node, a compressed node or a data unit is stored.

If fixed node size is employed in the memory, the number of searched bits need
not be stored in the node. In this case, therefore, the node need not contain
anything but an element array.

To minimize storage space occupancy and to improve memory efficiency,
compressed nodes are formed from the nodes of the trie structure in certain
cases. If an ordinary trie node has only one child, this means that only one
path downward in the tree passes through said trie node. In accordance with the
invention, a trie node containing only a single pointer (path downward) is
replaced with a compressed node in which the number of bits searched in said
path and the computed array index value are disclosed. Since it is advantageous
from the point of view of storage space occupancy to form compressed nodes from
single-child trie nodes throughout the entire memory structure, compression
also means that at least two child nodes are always maintained for ordinary
(uncompressed) trie nodes in the memory structure, that is, an individual
(ordinary) trie node has pointers to at least two different lower-level nodes
(child nodes). A compressed node replaces one or more successive internal
nodes, each of which has one child, and hence the above-stated one child cannot
be a bucket (or a leaf in a structure that has no buckets). Hence, a child node
must be an ordinary trie node in order for compression to be possible. From the
point of view of optimizing storage space, it is thus advantageous to always
maintain at least two child nodes for trie nodes preceding a bucket as well
(i.e., if the bucket is preceded by a trie node having a size of two elements,
said trie node always has two child nodes).

The memory in accordance with the invention thus comprises two 
types of trie nodes: ordinary trie nodes containing an element array in accor- 
dance with Figure 5, and compressed nodes that will be described in the fol- 
lowing. 

Figures 6a and 6b illustrate the principle of forming a compressed node. For
simplicity, all nodes are presumed to have a size of two elements. Figure 6a
shows a trie structure comprising six nodes, having only one path for the five
uppermost nodes. This trie structure of five nodes can be replaced with one
element array shown in Figure 6b. Since the structure has a single path for
these nodes, only one element of the array is in use, which in this exemplary
case is element 18 circled in the figure (18=10010 when the bits are taken in
line order, i.e. the x bits first and thereafter the y bits). Thus, for the
five uppermost nodes the trie structure can be replaced with a compressed node
in which the number of bits to be searched (5) and the value of the array index
(18) are stored.

Figures 7a and 7b show a local maintenance example when data units and
associated keys are deleted from a database. Figure 7a shows an initial
situation in which the memory structure comprises trie nodes N111...N113 and
buckets L2...L4. Thereafter bucket L2 and the pointer/record contained therein
is deleted from the memory, as a result of which nodes N111 and N112 can be
replaced with a compressed node CN, in which the index of the pointer contained
in the node and the number of bits searched in the path replaced by the
compressed node are disclosed. Hence, the compressed node is in principle
similar to an ordinary trie node, but instead of the entire large-size element
array with only one pointer being stored, the index of the pointer concerned
and the number of bits searched in the path are stored. This creates the
compressed node CN in accordance with Figure 7b, in which the number of bits
searched in said path (3) and the index corresponding to said pointer (101=5
when bit interleaving with line ordering is used) are disclosed. A compressed
node thus has a virtual array replacing the information contained in the one or
more node arrays existing in the path. If the compressed node replaces several
ordinary trie nodes, the number of searched bits indicated in the compressed
node is equal to the sum of the numbers of bits searched in the replaced nodes.
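The replacement of a single-path chain by a compressed node can be sketched as follows. The sketch assumes that the index bits of an upper node are more significant than those of a lower node, and the split of the three searched bits of Figure 7b into a 1-bit node and a 2-bit node is an assumption, since the sizes of N111 and N112 are visible only in the figure. The function name compress_chain is hypothetical.

```python
def compress_chain(chain):
    """Combine single-child trie nodes into (searched bits, array index).

    chain -- list of (bits searched in node, index of the only used
             element), ordered from the uppermost node downward.
    """
    total_bits, index = 0, 0
    for bits, idx in chain:
        if not 0 <= idx < (1 << bits):
            raise ValueError("index does not fit in the node's bit count")
        index = (index << bits) | idx    # append this node's bits
        total_bits += bits               # sum of the searched bits
    return total_bits, index

# Assumed split of Figure 7b's path: 1 bit (value 1), then 2 bits (value 01).
print(compress_chain([(1, 1), (2, 1)]))
```

The result (3, 5) corresponds to the three searched bits and the index 101 stored in the compressed node CN.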

Figure 8 illustrates the structure of a compressed node. The minimum
configuration of the node comprises 3 parts: field 120 indicating the number of
searched bits, field 121 storing the value of the array index, and field 122
storing a pointer to a child node. The compressed node is in need of this data
in order for the search to proceed with the correct value at the compressed
node as well, and in order for the restructuring of the node to be possible in
connection with changes in the memory structure. (Without information on the
number of searched bits, the array index value cannot be calculated from the
search key, and on the other hand without the array index value the calculated
value could not be compared to the value stored in the node.)
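A search step at a compressed node might look like the sketch below: the stored number of bits determines how much of the key to extract, and the extracted value is compared with the stored index before the child pointer is followed. The dictionary layout standing in for fields 120 to 122, the bit-string key, and the function name step_compressed are assumptions for illustration.

```python
def step_compressed(key_bits, pos, node):
    """Advance a search through one compressed node.

    key_bits -- remaining search word as a bit string (already interleaved)
    pos      -- number of bits consumed so far
    node     -- {"bits": field 120, "index": field 121, "child": field 122}
    Returns (child, new position), or None if the search fails here.
    """
    n = node["bits"]
    chunk = key_bits[pos:pos + n]
    if len(chunk) < n:
        return None                      # key too short for this node
    if int(chunk, 2) != node["index"]:
        return None                      # differs from the stored index
    return node["child"], pos + n

cn = {"bits": 3, "index": 0b101, "child": "N113"}
print(step_compressed("10110", 0, cn))
```

Only one comparison is needed regardless of how many ordinary trie nodes the compressed node replaces.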

If a collision occurs in the compressed node in connection with an insertion,
i.e. the compressed node would receive a new pointer, it is determined at which
bit position the index of the initial pointer and the index of the new pointer
first differ. Accordingly, a structure replacing the initial compressed node is
created, in which the new compressed node covers the index bits that the two
indices have in common. In addition, one or more trie nodes are created in the
structure at points corresponding to those bits in which the indices differ
from one another.
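Locating the distinguishing bit can be sketched with an exclusive-or of the two indices. The sketch only computes the common prefix (for the new compressed node) and the remaining bits of each index; how the remaining bits are distributed over the newly created trie nodes is left open, as the text does not fix it. The function name split_on_collision and its return layout are hypothetical.

```python
def split_on_collision(bits, old_index, new_index):
    """Find the common prefix of two indices of the same bit length.

    Returns (common bits, prefix value, remaining bits,
             old remainder, new remainder).
    """
    diff = old_index ^ new_index
    if diff == 0:
        raise ValueError("indices are identical; nothing to split")
    remaining = diff.bit_length()        # bits from the first difference on
    common = bits - remaining            # leading bits shared by both
    mask = (1 << remaining) - 1
    return (common, old_index >> remaining, remaining,
            old_index & mask, new_index & mask)

# Stored index 10010 collides with new index 10110: they share the prefix 10.
print(split_on_collision(5, 0b10010, 0b10110))
```

In this example a two-bit compressed node with index 10 would remain, and the three remaining bits (010 and 110) would select different elements in the trie node or nodes created below it.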

If the compressed node is preceded by one or more compressed nodes or a chain
of trie nodes providing only a single path, it is advantageous in view of
storage space requirement and memory efficiency to further combine said nodes.
Moreover, in view of memory efficiency it is advantageous to carry out the
combination of nodes in such a way that only in the compressed node that is the
last (lowest) in the chain the number of searched bits is smaller than the word
length in the computer used. In other words, nodes are combined in such a way
that the number of searched bits will be as large as possible in each
compressed node. For example, three successive compressed nodes in which the
numbers of searched bits are 5, 10 and 15 can be combined into one compressed
node in which the number of searched bits is 30. Likewise, for example three
successive compressed nodes (or three successive ordinary trie nodes providing
only one path) in which the numbers of searched bits are 10, 10 and 15 can be
combined into two compressed nodes in which the numbers of searched bits are 32
and 3, the word length employed being 32. Hence, it is attempted to obtain in
as many compressed nodes as possible a number of searched bits corresponding to
the word length of the computer, and the possible "superfluous" bits are left
for the compressed node that is lowest in the hierarchy.
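The regrouping rule can be sketched as follows: the index bits of the chain are concatenated and then cut from the top into pieces of at most one word, so that only the lowest node may hold fewer bits than the word length. The function name recombine, the (bits, index) pair layout and the default word length of 32 are assumptions for illustration.

```python
def recombine(chain, word_length=32):
    """Regroup a chain of (bits, index) nodes into word-sized nodes."""
    total_bits, value = 0, 0
    for bits, idx in chain:              # concatenate all index bits
        value = (value << bits) | idx
        total_bits += bits
    result, remaining = [], total_bits
    while remaining > 0:
        take = min(word_length, remaining)
        remaining -= take                # bits left below this node
        result.append((take, (value >> remaining) & ((1 << take) - 1)))
    return result

print([bits for bits, _ in recombine([(5, 1), (10, 2), (15, 3)])])
print([bits for bits, _ in recombine([(10, 0), (10, 0), (15, 0)])])
```

The first chain (5, 10 and 15 bits) fits into a single 30-bit node, while the second (10, 10 and 15 bits) is regrouped into nodes of 32 and 3 bits, as in the examples above.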

However, compressed nodes cannot be combined so as to make
the number of bits searched in one node higher than the word length in the
computer employed. Particularly in multidimensional cases (n>3), it has been
found to be common that there are so many successive nodes containing one
child that the path cannot be represented by a single compressed node.
Therefore, the search path or part thereof is replaced with a chain made up by
several successive compressed nodes, in which the number of searched bits
is the same as the word bit number, for example 32 in the Intel architecture,
except for the last node where the number of bits is smaller than or equal to
the word bit number.

Such a situation is depicted in Figure 9a, showing three successive
compressed nodes CN1...CN3. The numbers of bits searched in the nodes
are denoted by references b, b' and b" and the values of the array indices
contained in the nodes with i, i' and i", respectively. In the two uppermost
nodes, the number of searched bits has a maximum value (providing that a
32-bit computer architecture is used).

It is advantageous to form from a chain of several successive
compressed nodes resulting from limited word length a single node collecting
such compressed nodes. This collecting node is formed in such a way that the
pointer of the collecting node is set to point to the child of the compressed
node that is last in said chain, the sum of the numbers of bits searched in the
compressed nodes in the chain is set as the number of bits B searched in the
collecting node, and the array indices (i.e. search words) produced by bit
interleaving are inserted in the list or table T of the node in the order in
which they appear in the successive compressed nodes. Thus, the collecting node
will be a node CN4 as shown in Figure 9b, comprising three parts: field 130
containing a pointer to said lower-level node, field 131 containing the number
of searched bits B (the above sum), and list or table T containing in succession
the array indices produced by bit interleaving. This third part thus has a
varying size. In the example of the figure, the number of indices is three,
since the example of Figure 9a comprises three successive nodes.
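
One possible way to model the collecting node is sketched below. The class
and field names are hypothetical; only the three parts (pointer field 130,
bit count field 131 and table T) are taken from the description, and the
index values in the usage lines are arbitrary placeholders.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class CompressedNode:
    bits: int    # number of searched bits (b, b', b" in Figure 9a)
    index: int   # array index, i.e. search word value (i, i', i")
    child: Any   # pointer to the next node on the path


@dataclass
class CollectingNode:
    child: Any        # field 130: pointer to the lower-level node
    total_bits: int   # field 131: number of searched bits B
    table: List[int]  # table T: the indices in chain order


def collect(chain: List[CompressedNode]) -> CollectingNode:
    """Replace a chain of compressed nodes by a single collecting node."""
    return CollectingNode(
        child=chain[-1].child,                   # child of the last node
        total_bits=sum(n.bits for n in chain),   # B is the sum of the b values
        table=[n.index for n in chain],          # indices kept in order
    )


# Three chained nodes as in Figure 9a, with placeholder index values.
cn3 = CompressedNode(bits=7, index=0x5A, child="trie node")
cn2 = CompressedNode(bits=32, index=0x0000FFFF, child=cn3)
cn1 = CompressedNode(bits=32, index=0x12345678, child=cn2)
cn4 = collect([cn1, cn2, cn3])
print(cn4.total_bits, len(cn4.table))  # 71 3
```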

The number of elements (i.e., indices) EN in table T is obtained
from the number of searched bits B as follows:

$$
EN = \begin{cases}
\lfloor B/W \rfloor, & \text{if } B \,\mathrm{MOD}\, W = 0 \\
\lfloor B/W \rfloor + 1, & \text{if } B \,\mathrm{MOD}\, W \neq 0
\end{cases}
$$

where $\lfloor\;\rfloor$ is a floor function truncating decimals from the
number, W is the word length used, e.g. 32, and MOD refers to modulo
arithmetic. Thus, the number of indices need not be stored in the collecting
node as separate data, but it can be found on the basis of the number of
searched bits.

The number of bits B' needed to calculate the last index in the table
(denoted by reference b" in the figure), which does not necessarily equal the
word length, is obtained as follows:

$$
B' = \begin{cases}
W, & \text{if } B \,\mathrm{MOD}\, W = 0 \\
B \,\mathrm{MOD}\, W, & \text{if } B \,\mathrm{MOD}\, W \neq 0
\end{cases}
$$
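
As a small worked illustration, the two quantities derived from B, namely the
number of table elements EN and the bit count B' of the last index, can be
computed as follows. The function names are hypothetical.

```python
def table_size(total_bits: int, word_length: int = 32) -> int:
    """EN: number of indices in table T for B = total_bits searched bits."""
    if total_bits % word_length == 0:
        return total_bits // word_length
    return total_bits // word_length + 1


def last_index_bits(total_bits: int, word_length: int = 32) -> int:
    """B': number of bits used to calculate the last index in table T."""
    remainder = total_bits % word_length
    return word_length if remainder == 0 else remainder


# B = 71 with W = 32: three indices, the last one formed from 7 bits.
print(table_size(71), last_index_bits(71))  # 3 7
# B = 64 with W = 32: two indices, the last one formed from a full word.
print(table_size(64), last_index_bits(64))  # 2 32
```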

By forming a collecting node from several successive compressed
nodes, the number of memory references (pointers) can be reduced further. In
present-day computer architecture, comprising caches of various levels,
memory references require considerable computation time, and hence the
computation time will be diminished. At the same time, the need for storage
space for pointers is eliminated.

By means of compressed nodes, the storage requirement can be
effectively minimized particularly in conjunction with non-uniform key
distributions, since by means of compression the depth of the structure can be
arbitrarily increased on a local basis without increased storage space
requirement. Nodes containing one child are not created only in conjunction
with non-uniform key distributions, however; they will also be created in
conjunction with uniform key distributions when the number of dimensions n
of the structure is sufficiently large.

As was already indirectly stated above, in the memory in
accordance with the invention a bucket cannot be preceded by a compressed node,
but the parent node of a bucket is always either an ordinary trie node or an
empty element. Hence, a compressed node cannot point to a bucket, but it
always points either to another compressed node or to an ordinary trie node.
An empty element means that if the total number of records is smaller than the
number of pointers/records that the bucket can accommodate, a tree-shaped
structure is not needed yet, but one bucket will suffice in the structure (in
which case said node is conceptually preceded by an empty element). It is
advantageous to proceed in this way at the initial phase of starting up the
memory. It is thus worthwhile to start building up the tree-shaped structure
only when this is necessary.

In other respects, the retrievals, insertions and deletions to be
carried out in the memory are performed in a manner known per se. In this
regard, reference is made e.g. to the international application mentioned at
the beginning, providing a more detailed description of collision situations in
association with insertions, for example. Instead of conventional deletion
updating, the memory may also employ functional updating implemented by known
methods by copying the path from root to buckets.

As already stated at the beginning, the above-described
compression principle also relates to a bucketless trie structure. In such a
case, the equivalent of a bucket is a data unit (to which a leaf node in the
bucketless structure points).

Figure 10 shows a memory in accordance with the invention on
block diagram level. Each dimension has a dedicated input register, and hence
there is a total of n input registers. The search key of each dimension is
stored in these input registers, denoted by references R1...Rn, each key in a
register of its own. The input registers are connected to a register TR in
which the above-described search word is formed in accordance with the bit
interleaving method employed. The register TR is connected via adder S to the
address input of memory MEM. The output of the memory in turn is connected to
address register AR, the output of which in turn is connected to adder S.
Initially the bits selected from each register are read into the common
register TR in the correct order. The initial address of the first trie node is
first stored in the address register AR, and the address obtained as an offset
address from register TR is added to the initial address in adder S. The
resulting address is supplied to the address input of the memory MEM, and the
data output of the memory provides the initial address of the next trie node,
the address being written into the address register AR over the previous
address stored therein. Thereafter the next selected bits are again loaded from
the input registers into the common register TR in the correct order, and the
array address thus obtained is added to the initial address of the relevant
array (i.e., trie node), obtained from the address register AR. This address is
again supplied to the address input of the memory MEM, the data output of the
memory thereafter providing the initial address of the next node. The
above-described procedure is repeated until the desired point has been accessed
and recordal can be performed or the desired record read.
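
The address computation loop of Figure 10 can be approximated in software as
follows. This is a simplified sketch under several assumptions: the memory
MEM is modelled as a flat Python list, empty elements as None, the search
words are given ready-made (bit interleaving is not modelled), and buckets
and compressed nodes are left out. The names follow the reference symbols of
the figure where possible; the function name is hypothetical.

```python
from typing import List, Optional, Sequence


def lookup(mem: List[Optional[int]], root_address: int,
           search_words: Sequence[int]) -> Optional[int]:
    """Walk ordinary trie nodes stored in a flat memory array."""
    ar = root_address            # address register AR: start of current node
    for tr in search_words:      # register TR: search word for this level
        address = ar + tr        # adder S: node start plus offset
        ar = mem[address]        # data output of MEM is written into AR
        if ar is None:
            return None          # empty element: nothing stored on this path
    return ar


# Two nodes of four elements each: the root at address 0, a child at 4.
mem: List[Optional[int]] = [None] * 8
mem[0 + 2] = 4      # root element 2 holds the start address of the child
mem[4 + 1] = 100    # child element 1 holds the address reached by the path
print(lookup(mem, 0, [2, 1]))  # 100
print(lookup(mem, 0, [1, 1]))  # None
```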

Control logic CL attends to the compression and to the correct
number of bits being extracted from the registers in each node. If dynamically
changing node sizes are employed in the memory, the control logic also
attends to maintenance of node sizes.

The rapidity of the address computation can be influenced by the
type of hardware configuration chosen. Since progress is by way of the
above-stated bit manipulations, address computation can be accelerated by
shifting from use of one processor to a multiprocessor environment in which
parallel processing is carried out. An alternative implementation to the
multiprocessor environment is an ASIC circuit.

Even though the invention has been described in the above with
reference to examples in accordance with the accompanying drawings, it is
obvious that the invention is not to be so restricted, but it can be modified
within the scope of the inventive idea disclosed in the appended claims.
Compression may, for example, be implemented in part of the memory only. The
structure may also be implemented for keys of variable length. As was already
stated at the beginning, the solution can be applied regardless of whether fixed
or changing node size is employed in the memory. Hence, when the appended
claims recite that in the node a given number of bits is selected from the bit
string made up by the search keys employed, this shall be construed to cover
both alternatives. Also, the address computation may continue in the bucket,
providing that unsearched bits remain. The definition of a bucket given at the
beginning is thus to be broadened to read that a bucket is a data structure that
may also contain another trie structure. Hence, several directory structures in
accordance with the present invention can be linked in succession in such a
way that another directory structure (that is, another trie structure) is stored in
a bucket, or a pointer contained in a bucket or a leaf points to another directory
structure. Reference from a bucket or a leaf is made directly to the root node of
the next directory structure. Generally, it may be stated that a bucket contains
at least one element so that the type of an individual element is selected from
a group comprising a data unit, a pointer to a stored data unit, a pointer to
another directory structure and another directory structure. The detailed
implementation of buckets is dependent on the application. In many cases, all
elements in buckets may be of the same type, being e.g. either a data unit or a
pointer to a data unit. On the other hand, for instance in an application in which
character strings are stored in the memory the bucket may contain element
pairs in such a way that all pairs in the bucket are either pointer to data
unit/pointer to directory structure pairs or data unit/pointer to a directory
structure pairs or data unit/directory structure pairs. In such a case, for
example, the prefix of the character string may be stored in the data unit and
the search may be continued from the directory structure that is the pair of the
data unit.

Claims:

1. A method for implementing a memory, in which memory data is
stored as data units for each of which a dedicated storage space is assigned
in the memory, in accordance with which method

- the memory is implemented as a directory structure comprising a
tree-shaped hierarchy having nodes at several different hierarchical levels,
wherein an individual node can be (i) a trie node comprising an array wherein
an individual element may contain the address of a lower node in the
tree-shaped hierarchy and wherein an individual element may also be empty, the
number of elements in the array corresponding to a power of two, or (ii) a
bucket containing at least one element so that the type of an individual
element in the bucket is selected from a group including a data unit, a pointer
to a stored data unit, a pointer to a node in another directory structure and
another directory structure,

- address computation performed in the directory structure
comprises the steps of

- (a) selecting in the node at the uppermost level of the tree-shaped
hierarchy a given number of bits from the bit string formed by the search keys
employed, forming from the selected bits a search word with which the
address of the next node is sought in the node, and proceeding to said node,

- (b) selecting from the unselected bits in the bit string formed by the
search keys employed a given number of bits and forming from the selected
bits a search word with which the address of a further new node at a lower
level is sought from the array of the node that has been accessed,

- repeating step (b) until an empty element is encountered or until
the address of the new node at a lower level is the address of a bucket,
characterized in that

in at least part of the directory structure, sets of successive trie
nodes are replaced with compressed nodes in such a way that an individual
set made up by successive trie nodes, from each of which there is only one
address to a trie node at a lower level, is replaced with a compressed node
(CN) storing an address to the node that the lowest node in the set to be
replaced points to, information on the value of the search word by means of
which said address is found, and information on the total number of bits from
which search words are formed in the set to be replaced.



WO 98/41933 



PCT/FI98/00192 






2. A method as claimed in claim 1, characterized in that 
replacement is carried out throughout the entire directory structure in such a 
way that all said sets are replaced with compressed nodes. 

3. A method as claimed in claim 1, characterized in that
replacement is also carried out on a set comprising only one trie node, the total
number of bits to be stored corresponding to the number of bits from which a
search word is formed in said trie node.

4. A method as claimed in claim 1, characterized in that
several successive compressed nodes are formed in the directory structure in
such a way that at least in the compressed node at the uppermost level a
number of search key bits to be searched corresponding to the word length
employed is collected.

5. A method as claimed in claim 1, characterized in that
several successive compressed nodes are combined into one new
compressed node, the number of bits stored in the new node being the sum of the
numbers obtained from the nodes to be combined.

6. A method as claimed in claim 4, characterized in that a
chain made up by successive compressed nodes wherein the number of bits
searched in at least two uppermost nodes corresponds to the word length
employed, is replaced with one collecting node (CN4) comprising

- an address to the node to which the lowest node in the chain
contains an address,

- the sum of the numbers of searched bits obtained from the nodes
in the chain, and

- the search word values contained in the chain nodes in sequence.

7. A method as claimed in claim 1, characterized in that in 
all uncompressed trie nodes of the memory, at least two addresses to a lower- 
level node are maintained. 

8. A method for implementing a memory, in which memory data is
stored as data units for each of which a dedicated storage space is assigned
in the memory, in accordance with which method

- the memory is implemented as a directory structure comprising a
tree-shaped hierarchy having nodes at several different hierarchical levels,
wherein an individual node can be (i) an internal node comprising an array
wherein an individual element may contain the address of a lower node in the
tree-shaped hierarchy and wherein an individual element may also be empty,
the number of elements in the array corresponding to a power of two, or (ii) a
leaf containing at least one element the type of which is one from a group
including a pointer to a stored data unit and a pointer to a node in another
directory structure,

- address computation performed in the directory structure
comprises the steps of

- (a) selecting in the node at the uppermost level of the tree-shaped
hierarchy a given number of bits from the bit string formed by the search keys
employed, forming from the selected bits a search word with which the
address of the next node is sought in the node, and proceeding to said node,

- (b) selecting from the unselected bits in the bit string formed by the
search keys employed a given number of bits and forming from the selected
bits a search word with which the address of a further new node at a lower
level is sought from the array of the node that has been accessed,

- repeating step (b) until an empty element is encountered or until
the address of the new node at a lower level is the address of a leaf,
characterized in that

in at least part of the directory structure, sets of successive internal
nodes are replaced with compressed nodes in such a way that an individual
set made up by successive internal nodes, from each of which there is only
one address to an internal node at a lower level, is replaced with a
compressed node (CN) storing an address to the node that the lowest node in the
set to be replaced points to, information on the value of the search word by
means of which said address is found, and information on the total number of
bits from which search words are formed in the set to be replaced.

9. A method as claimed in claim 8, characterized in that 
replacement is performed in the entire directory structure in such a way that all 
said sets are replaced with compressed nodes. 

10. A method as claimed in claim 8, characterized in that
replacement is also carried out on a set comprising only one internal node, the
total number of bits to be stored corresponding to the number of bits from
which a search word is formed in said internal node.

11. A method as claimed in claim 8, characterized in that
several successive compressed nodes are formed in the directory structure in
such a way that at least in the compressed node at the uppermost level a
number of search key bits to be searched corresponding to the word length
employed is collected.

12. A method as claimed in claim 8, characterized in that
several successive compressed nodes are combined into one new
compressed node, the number of bits stored in the new node being the sum of the
numbers obtained from the nodes to be combined.

13. A method as claimed in claim 11, characterized in that
a chain made up by successive compressed nodes wherein the number of bits
searched in at least two uppermost nodes corresponds to the word length
employed is replaced with one collecting node (CN4) comprising

- an address to the node to which the lowest node in the chain
contains an address,

- the sum of the numbers of searched bits obtained from the nodes
in the chain, and

- the search word values contained in the chain nodes in sequence.

14. A method as claimed in claim 8, characterized in that in
all uncompressed internal nodes of the memory, at least two addresses to a
lower-level node are maintained.